Noise adaptive speech recognition based on sequential noise parameter estimation

Internal identifier: 00AB73 (Main/Exploration); previous: 00AB72; next: 00AB74

Authors: Kaisheng Yao [Japan]; Kuldip K. Paliwal [Japan, Australia]; Satoshi Nakamura [Japan]

Source:

Speech Communication [ISSN 0167-6393]; 2004.

RBID : Pascal:04-0276268

French descriptors

Pascal (Inist)
- Reconnaissance parole, Estimation paramètre, Estimation séquentielle, Réduction bruit, Algorithme EM, Processus non stationnaire, Bruit additif.

English descriptors

KwdEn:
- Additive noise, EM algorithm, Noise reduction, Non stationary process, Parameter estimation, Sequential estimation, Speech recognition.

Abstract

In this paper, a noise adaptive speech recognition approach is proposed for recognizing speech which is corrupted by additive non-stationary background noise. The approach sequentially estimates noise parameters, through which a non-linear parametric function adapts mean vectors of acoustic models. In the estimation process, posterior probability of state sequence given observation sequence and the previously estimated noise parameter sequence is approximated by the normalized joint likelihood of active partial paths and observation sequence given the previously estimated noise parameter sequence. The Viterbi process provides the normalized joint-likelihood. The acoustic models are not required to be trained from clean speech and they can be trained from noisy speech. The approach can be applied to perform continuous speech recognition in presence of non-stationary noise. Experiments conducted on speech contaminated by simulated and real non-stationary noise show that when acoustic models are trained from clean speech, the noise adaptive speech recognition system provides improvements in word accuracy as compared to the normal noise compensation system (which assumes the noise to be stationary) in slowly time-varying noise. When the acoustic models are trained from noisy speech, the noise adaptive speech recognition system is found to be helpful to get improved performance in slowly time-varying noise over a system employing multi-conditional training.
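
As a rough illustration of the kind of non-linear mean adaptation the abstract mentions, the sketch below uses the common log-add combination of speech and noise in the log-spectral domain. The specific function, the names `adapt_mean`, `clean_mean` and `noise_track`, and all numeric values are assumptions made for illustration; the abstract does not state the exact form used in the paper, nor how the noise estimates are obtained.

```python
import math

def adapt_mean(clean_mean, noise_mean):
    """Combine clean-speech and noise means per dimension (log-spectral domain).

    Assumed log-add form: log(exp(x) + exp(n)), computed in a numerically
    stable way. This is an illustrative choice, not taken from the paper.
    """
    return [
        max(x, n) + math.log1p(math.exp(-abs(x - n)))
        for x, n in zip(clean_mean, noise_mean)
    ]

# Hypothetical clean-speech mean vector and a slowly rising noise estimate.
clean_mean = [2.0, 1.0, 0.5]
noise_track = [[-1.0, -1.0, -1.0], [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]

# Re-adapt the model mean each time a new noise estimate arrives.
adapted_track = [adapt_mean(clean_mean, noise) for noise in noise_track]
for noise, adapted in zip(noise_track, adapted_track):
    print(noise, [round(v, 3) for v in adapted])
```

The point of the sketch is only that the adapted mean follows the noise estimate frame by frame, whereas a stationary-noise system would compute it once.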

Affiliations:

Australia, Japan

Links toward previous steps (curation, corpus...)

to stream PascalFrancis, to step Corpus: 004F03
to stream PascalFrancis, to step Curation: 001211
to stream PascalFrancis, to step Checkpoint: 004A34
to stream Main, to step Merge: 00B861
to stream Main, to step Curation: 00AB73

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Noise adaptive speech recognition based on sequential noise parameter estimation</title>
<author><name sortKey="Kaisheng Yao" sort="Kaisheng Yao" uniqKey="Kaisheng Yao" last="Kaisheng Yao">KAISHENG YAO</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>ATR Spoken Language Translation Research Labs</s1>
<s2>Kyoto</s2>
<s3>JPN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Japon</country>
<wicri:noRegion>ATR Spoken Language Translation Research Labs</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Paliwal, Kuldip K" sort="Paliwal, Kuldip K" uniqKey="Paliwal K" first="Kuldip K." last="Paliwal">Kuldip K. Paliwal</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>ATR Spoken Language Translation Research Labs</s1>
<s2>Kyoto</s2>
<s3>JPN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Japon</country>
<wicri:noRegion>ATR Spoken Language Translation Research Labs</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><inist:fA14 i1="02"><s1>School of Microelectronic Engineering, Griffith University</s1>
<s2>Brisbane</s2>
<s3>AUS</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Australie</country>
<wicri:noRegion>Brisbane</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Nakamura, Satoshi" sort="Nakamura, Satoshi" uniqKey="Nakamura S" first="Satoshi" last="Nakamura">Satoshi Nakamura</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>ATR Spoken Language Translation Research Labs</s1>
<s2>Kyoto</s2>
<s3>JPN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Japon</country>
<wicri:noRegion>ATR Spoken Language Translation Research Labs</wicri:noRegion>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">04-0276268</idno>
<date when="2004">2004</date>
<idno type="stanalyst">PASCAL 04-0276268 INIST</idno>
<idno type="RBID">Pascal:04-0276268</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">004F03</idno>
<idno type="wicri:Area/PascalFrancis/Curation">001211</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">004A34</idno>
<idno type="wicri:explorRef" wicri:stream="PascalFrancis" wicri:step="Checkpoint">004A34</idno>
<idno type="wicri:doubleKey">0167-6393:2004:Kaisheng Yao:noise:adaptive:speech</idno>
<idno type="wicri:Area/Main/Merge">00B861</idno>
<idno type="wicri:Area/Main/Curation">00AB73</idno>
<idno type="wicri:Area/Main/Exploration">00AB73</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Noise adaptive speech recognition based on sequential noise parameter estimation</title>
<author><name sortKey="Kaisheng Yao" sort="Kaisheng Yao" uniqKey="Kaisheng Yao" last="Kaisheng Yao">KAISHENG YAO</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>ATR Spoken Language Translation Research Labs</s1>
<s2>Kyoto</s2>
<s3>JPN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Japon</country>
<wicri:noRegion>ATR Spoken Language Translation Research Labs</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Paliwal, Kuldip K" sort="Paliwal, Kuldip K" uniqKey="Paliwal K" first="Kuldip K." last="Paliwal">Kuldip K. Paliwal</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>ATR Spoken Language Translation Research Labs</s1>
<s2>Kyoto</s2>
<s3>JPN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Japon</country>
<wicri:noRegion>ATR Spoken Language Translation Research Labs</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><inist:fA14 i1="02"><s1>School of Microelectronic Engineering, Griffith University</s1>
<s2>Brisbane</s2>
<s3>AUS</s3>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>Australie</country>
<wicri:noRegion>Brisbane</wicri:noRegion>
</affiliation>
</author>
<author><name sortKey="Nakamura, Satoshi" sort="Nakamura, Satoshi" uniqKey="Nakamura S" first="Satoshi" last="Nakamura">Satoshi Nakamura</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>ATR Spoken Language Translation Research Labs</s1>
<s2>Kyoto</s2>
<s3>JPN</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>Japon</country>
<wicri:noRegion>ATR Spoken Language Translation Research Labs</wicri:noRegion>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">Speech communication</title>
<title level="j" type="abbreviated">Speech commun.</title>
<idno type="ISSN">0167-6393</idno>
<imprint><date when="2004">2004</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">Speech communication</title>
<title level="j" type="abbreviated">Speech commun.</title>
<idno type="ISSN">0167-6393</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Additive noise</term>
<term>EM algorithm</term>
<term>Noise reduction</term>
<term>Non stationary process</term>
<term>Parameter estimation</term>
<term>Sequential estimation</term>
<term>Speech recognition</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Reconnaissance parole</term>
<term>Estimation paramètre</term>
<term>Estimation séquentielle</term>
<term>Réduction bruit</term>
<term>Algorithme EM</term>
<term>Processus non stationnaire</term>
<term>Bruit additif</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">In this paper, a noise adaptive speech recognition approach is proposed for recognizing speech which is corrupted by additive non-stationary background noise. The approach sequentially estimates noise parameters, through which a non-linear parametric function adapts mean vectors of acoustic models. In the estimation process, posterior probability of state sequence given observation sequence and the previously estimated noise parameter sequence is approximated by the normalized joint likelihood of active partial paths and observation sequence given the previously estimated noise parameter sequence. The Viterbi process provides the normalized joint-likelihood. The acoustic models are not required to be trained from clean speech and they can be trained from noisy speech. The approach can be applied to perform continuous speech recognition in presence of non-stationary noise. Experiments conducted on speech contaminated by simulated and real non-stationary noise show that when acoustic models are trained from clean speech, the noise adaptive speech recognition system provides improvements in word accuracy as compared to the normal noise compensation system (which assumes the noise to be stationary) in slowly time-varying noise. When the acoustic models are trained from noisy speech, the noise adaptive speech recognition system is found to be helpful to get improved performance in slowly time-varying noise over a system employing multi-conditional training.</div>
</front>
</TEI>
<affiliations><list><country><li>Australie</li>
<li>Japon</li>
</country>
</list>
<tree><country name="Japon"><noRegion><name sortKey="Kaisheng Yao" sort="Kaisheng Yao" uniqKey="Kaisheng Yao" last="Kaisheng Yao">KAISHENG YAO</name>
</noRegion>
<name sortKey="Nakamura, Satoshi" sort="Nakamura, Satoshi" uniqKey="Nakamura S" first="Satoshi" last="Nakamura">Satoshi Nakamura</name>
<name sortKey="Paliwal, Kuldip K" sort="Paliwal, Kuldip K" uniqKey="Paliwal K" first="Kuldip K." last="Paliwal">Kuldip K. Paliwal</name>
</country>
<country name="Australie"><noRegion><name sortKey="Paliwal, Kuldip K" sort="Paliwal, Kuldip K" uniqKey="Paliwal K" first="Kuldip K." last="Paliwal">Kuldip K. Paliwal</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>
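
A record like the one above can be read with a standard XML parser, with one caveat: the `wicri:` and `inist:` prefixes are used without namespace declarations, so a strict parser would reject the text as it stands. The sketch below is a minimal illustration in Python that binds those prefixes to placeholder URIs before parsing. The URIs, the function names `parse_record` and `summarize`, and the abridged `SAMPLE` string are assumptions for illustration and are not part of Dilib.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the record above; only a few elements are kept.
SAMPLE = """<record><TEI><teiHeader><fileDesc><titleStmt>
<title xml:lang="en" level="a">Noise adaptive speech recognition based on sequential noise parameter estimation</title>
<author><name sortKey="Kaisheng Yao" last="Kaisheng Yao">KAISHENG YAO</name>
<affiliation wicri:level="1"><country>Japon</country></affiliation></author>
<author><name sortKey="Paliwal, Kuldip K" first="Kuldip K." last="Paliwal">Kuldip K. Paliwal</name>
<affiliation wicri:level="1"><country>Japon</country></affiliation>
<affiliation wicri:level="1"><country>Australie</country></affiliation></author>
</titleStmt></fileDesc></teiHeader></TEI></record>"""

def parse_record(xml_text):
    """Parse a record whose wicri:/inist: prefixes are not declared."""
    # Bind the undeclared prefixes to placeholder URIs (hypothetical values,
    # needed only so that the parser accepts the prefixed names).
    patched = xml_text.replace(
        "<record>",
        '<record xmlns:wicri="urn:example:wicri" xmlns:inist="urn:example:inist">',
        1,
    )
    return ET.fromstring(patched)

def summarize(root):
    """Return the title and a list of (author name, countries) pairs."""
    title = root.find(".//titleStmt/title").text
    authors = []
    # Restrict to titleStmt: the full record repeats the authors under analytic.
    for author in root.findall(".//titleStmt/author"):
        name = author.find("name").text
        countries = [c.text for c in author.findall("affiliation/country")]
        authors.append((name, countries))
    return title, authors

title, authors = summarize(parse_record(SAMPLE))
print(title)
for name, countries in authors:
    print(name, countries)
```

Country names come out exactly as stored in the record (`Japon`, `Australie`); any translation would be a separate step.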

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Asie/explor/AustralieFrV1/Data/Main/Exploration

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 00AB73 | SxmlIndent | more

Or, assuming `$EXPLOR_AREA` already points at the root of the AustralieFrV1 area:

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 00AB73 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Asie
   |area=    AustralieFrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:04-0276268
   |texte=   Noise adaptive speech recognition based on sequential noise parameter estimation
}}

This area was generated with Dilib version V0.6.33.
Data generation: Tue Dec 5 10:43:12 2017. Site generation: Tue Mar 5 14:07:20 2024

	Exploration server on relations between France and Australia
	Warning: this site is under development. It was generated automatically from raw corpora, so the information has not been validated.